Abstract

Background: Mastitis is the most costly disease for dairy production, and control of the disease is often difficult, 
due to its multi-factorial nature. Susceptibility to mastitis is under partial genetic control and the industry uses 
indirect selection for decreased concentrations of somatic cells in milk to reduce mastitis. 

Methods: A genome-wide scan was performed to identify genomic regions associated with deregressed estimated
breeding values (EBVs) for somatic cell counts (SCC) in Holstein bulls. In total, 1183 proven bulls of the Italian
Holstein population were genotyped with the BovineSNP50 BeadChip (Illumina, San Diego, CA) and a whole
genome association analysis was performed using the R package GenABEL.

Results: Two chromosomal regions showed association with SCC, a region on chromosome 14 with high
significance (P < 5 x 10^-6) and a region on chromosome 6 with moderate significance (P < 5 x 10^-5).

Conclusions: Two regions with effects on SCC have been identified with good statistical support. A further study 
of these candidate regions will be performed to verify the results and identify the causal mutations. 



Background 

Mastitis, an inflammation of the mammary gland caused 
by an infection with a range of bacteria, is the most 
costly disease for dairy production. Control of mastitis is 
difficult due to its multi-factorial nature. Susceptibility 
to mastitis is under partial genetic control and the 
industry uses selection on a correlated trait (somatic
cell score in milk) to reduce mastitis incidence in the
population. Over the last few years, several studies have
identified genetic loci putatively associated with somatic 
cell counts or clinical mastitis [1,2]. The availability of 
the bovine genome sequence and high density genotyp- 
ing panels of single nucleotide polymorphisms has 
allowed a considerable number of bulls worldwide to be 
genotyped for genomic evaluation and selection. 
Furthermore, this information can be used to perform 
association studies with high precision at genome-wide 
level. The work reported here used genotypic data from 
the genomic selection project to perform a genome-wide
scan with the objective of identifying genomic regions
associated with deregressed estimated breeding values
(DR-EBVs) for somatic cell counts (SCC) in Holstein
bulls.

Methods 

Animals 

The bulls chosen for the genome-wide association study
were selected from among the 3155 progeny-tested animals
in Italy with DNA samples available. All these bulls
will be used by the Italian National Breeders Association
of Holstein Friesian Cattle (ANAFI) to perform national
genomic evaluations.

The selection criteria for the association study were
intended to obtain: i) bulls with high selection index
reliability (PFT > 0.75%); and ii) relationships between
animals in the dataset that were as low as possible, by
keeping as many families (father-son pairs) as possible.
Among the 3155 bulls with biological material available,
2109 met the criteria for inclusion in the study, 1183
of which had already been genotyped with the Bovine 50K
SNP chip (Illumina Inc, San Diego).

Phenotype: deregressed EBV for SCC 

The EBV for SCC had a mean of 98.73 ± 5.3 for the
2109 bulls, and a mean of 98.77 ± 6.3 in the cohort of
1183 animals included in the study. The deregressed
EBVs (DR-EBVs) had a mean of 0 and a standard
deviation of 5. The DR-EBVs and reliabilities for
somatic cell counts were derived from a reduced animal 
model for single records on a single trait. The strategy 
used to estimate the deregressed estimates was a simpli- 
fied version of the algorithm of Jamrozik et al. [3], 
appropriate to a single trait reduced animal model. 

Statistical analysis 

Genome-wide association analysis was performed with
the GenABEL package in R using a three-step GRAMMAR-GC
approach (Genome-wide Association using Mixed Model and
Regression - Genomic Control) [4,5]. Uncorrected
p-values of P < 5 x 10^-7 were accepted as very strong
evidence of genome-wide association, while p-values
between 5 x 10^-7 and 5 x 10^-5 were considered
moderately significant associations.
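
The significance tiers and the genomic control idea described above can be illustrated with a minimal Python sketch. This is not GenABEL code and not necessarily the authors' exact procedure: the function names (genomic_inflation, gc_correct, classify_p) are hypothetical, the inclusive treatment of the upper boundary is an assumption, and the inflation factor shown is the usual median-based estimate for 1-df chi-square statistics.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364

# Thresholds quoted in the text (uncorrected p-values).
STRONG_THRESHOLD = 5e-7
MODERATE_THRESHOLD = 5e-5


def p_to_chi2(p):
    """Convert a two-sided p-value in (0, 1] to a 1-df chi-square statistic."""
    z = NormalDist().inv_cdf(p / 2.0)  # lower-tail quantile, z <= 0
    return z * z


def genomic_inflation(p_values):
    """Median-based inflation factor (lambda) of the observed statistics."""
    chi2_values = [p_to_chi2(p) for p in p_values]
    return median(chi2_values) / CHI2_1DF_MEDIAN


def gc_correct(chi2, inflation):
    """Deflate a test statistic; factors below 1 are left uncorrected."""
    return chi2 / max(inflation, 1.0)


def classify_p(p):
    """Assign a p-value to one of the tiers used in the text.

    Assumption: the 5e-5 boundary is treated as inclusive.
    """
    if p < STRONG_THRESHOLD:
        return "strong"
    if p <= MODERATE_THRESHOLD:
        return "moderate"
    return "not significant"
```

An inflation factor close to 1 suggests that residual stratification or relatedness is not inflating the test statistics; values well above 1 would call for the deflation step before the thresholds are applied.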

Genotyping and quality control filters 

A total of 1183 progeny tested bulls were genotyped 
with the BovineSNP50 BeadChip (Illumina, San Diego, 
CA). Genotype quality assurance was performed within 
the R statistical environment using the GenABEL package
("check.marker" function) [6]. SNPs were checked for
marker call rate and minor allele frequency (MAF):
markers with more than 5% missing data or with a MAF
below 5% were removed. Genotyping efficiency of samples
was also verified, and samples with more than 5% missing
data were removed. Classical multidimensional scaling
(MDS) was used to explore population substructure and
to verify the genetic homogeneity of the dataset prior
to analysis.
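
The marker-level filters can be sketched in a few lines of plain Python. This is only an illustration of the two criteria, not the "check.marker" implementation: the genotype coding (0, 1, 2 copies of one allele, None for a missing call) and the function names are assumptions made for the example.

```python
def call_rate(genotypes):
    """Fraction of samples with a non-missing genotype at one marker."""
    called = sum(1 for g in genotypes if g is not None)
    return called / len(genotypes)


def minor_allele_frequency(genotypes):
    """MAF at one marker from 0/1/2 allele counts, ignoring missing calls."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))
    return min(freq, 1.0 - freq)


def marker_passes_qc(genotypes, min_call_rate=0.95, min_maf=0.05):
    """Keep a marker with at most 5% missing data and a MAF of at least 5%."""
    return (call_rate(genotypes) >= min_call_rate
            and minor_allele_frequency(genotypes) >= min_maf)


# Hypothetical markers, each genotyped on a handful of samples.
common = [0, 1, 2, 1, 0, 1, 2, 1, 0, 1]            # kept
poorly_called = [0, 1, None, None, 2, 1, 0, 1, 2, 1]  # 20% missing
rare = [0] * 19 + [1]                              # MAF = 0.025

kept = [m for m in (common, poorly_called, rare) if marker_passes_qc(m)]
```

The same call-rate function applied across markers for one animal, rather than across animals for one marker, gives the per-sample genotyping efficiency check.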

Results and discussion 

Quality control 

Following quality control checks, 641 markers were 
excluded because of low call rate and 11404 markers 
were excluded because of low minor allele frequency. 
Furthermore, markers on the sex chromosomes were 
removed from the analysis. A total of 8 samples were
removed because of low call rate and 2 others were
eliminated because of high autosomal heterozygosity
(FDR < 1%). Mean heterozygosity of the dataset after
quality control was 0.33 ± 0.01, while the samples
removed had heterozygosity higher than 0.63, indicating
possible sample contamination. No samples were
removed due to high IBS (Identity By State). Mean IBS
was 0.70 ± 0.01, based on 2000 autosomal markers,
while the threshold for IBS was set to > 0.95. No
outliers were identified by classical multidimensional
scaling (MDS).

After quality control, the final dataset used in the
following association analysis contained 1173 samples
and 41209 genome-wide SNPs.
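
The sample-level checks reported in this section can be illustrated with a short Python sketch. It assumes genotypes coded as 0, 1 or 2 copies of one allele with None for missing calls; the function names are hypothetical and the IBS definition used here (mean proportion of alleles shared over markers called in both animals) is one common convention, not necessarily the one implemented in GenABEL.

```python
def heterozygosity(genotypes):
    """Proportion of heterozygous calls (coded 1) among called markers."""
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes")
    return sum(1 for g in called if g == 1) / len(called)


def ibs(sample_a, sample_b):
    """Mean identity by state over markers called in both samples.

    Each marker contributes 1 - |a - b| / 2, i.e. 1.0 for identical
    genotypes, 0.5 for one shared allele and 0.0 for opposite homozygotes.
    """
    shared = [(a, b) for a, b in zip(sample_a, sample_b)
              if a is not None and b is not None]
    if not shared:
        raise ValueError("no markers called in both samples")
    return sum(1.0 - abs(a - b) / 2.0 for a, b in shared) / len(shared)


# Sample bookkeeping reported in the text.
genotyped = 1183
removed_low_call_rate = 8
removed_high_heterozygosity = 2
remaining = genotyped - removed_low_call_rate - removed_high_heterozygosity
```

Under this convention a pair of animals with IBS above 0.95 would be flagged as a possible duplicate, and a sample whose heterozygosity is far above the dataset mean would be flagged as possibly contaminated.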

Association analysis 

Two chromosomal regions showed associations with
SCC, a region on chromosome 14 with high significance
(P < 5 x 10^-6) and a region on chromosome 6 with
moderate significance (P < 5 x 10^-5). These two
chromosome regions should be further tested to confirm
these associations and to potentially identify the
causative variations that affect this trait.

A recent review of QTL reported on chromosome 14
[7] identified 10 QTL for disease traits such as mastitis,
seven of which were related to somatic cell score [8-11].
Interestingly, the SNP identified in this study on
chromosome 14 is within 1 Mb of the QTL identified by
Kaupe et al. [7] as associated with somatic cell score,
but far from all other chromosomal regions that
harbor QTL for clinical mastitis [7].

Furthermore, Nilsen et al. [2] characterized a region of
chromosome 6 in which QTL for clinical mastitis had
been identified, and found the Mucin 7 gene to be
significantly associated with the trait. Mucin 7 is located
close to the casein cluster on chromosome 6. However,
the SNP on chromosome 6 identified in this study is
more than 2 Mb distant from the casein cluster,
indicating that different genes could be involved. To
confirm the results of the current study, both SNPs
identified will be tested in a second independent
set of animals.